A General Internal Regret-Free Strategy
نویسندگان
چکیده
We study sequential decision problems where the decision maker does not observe the states of nature, but rather receives a noisy signal, whose distribution depends on the current state and on the action that she plays. We do not assume that the decision maker considers the worst-case scenario, but rather has a response correspondence, which maps distributions over signals to subjective best responses. We extend the concept of internal regret-free strategy to this setup and provide an algorithm that generates such a strategy. Journal of Economic Literature classification numbers: C61, C72, D81, D82, D83
منابع مشابه
Calibration and Internal no-Regret with Partial Monitoring
Calibrated strategies can be obtained by performing strategies that have no internal regret in some auxiliary game. Such strategies can be constructed explicitly with the use of Blackwell's approachability theorem , in an other auxiliary game. We establish the converse: a strategy that approaches a convex B-set can be derived from the construction of a calibrated strategy. We develop these tool...
متن کاملSet-valued approachability and online learning with partial monitoring
Approachability has become a standard tool in analyzing learning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in the obtained reward: it belongs to a set rather than being a single vector. Using this variant we tackle the problem of approachability in games with partial monitoring and develop a simple and general...
متن کاملOnline Learning: Sufficient Statistics and the Burkholder Method
We uncover a fairly general principle in online learning: If regret can be (approximately) expressed as a function of certain “sufficient statistics” for the data sequence, then there exists a special Burkholder function that 1) can be used algorithmically to achieve the regret bound and 2) only depends on these sufficient statistics, not the entire data sequence, so that the online strategy is...
متن کاملFrom External to Internal Regret
External regret compares the performance of an online algorithm, selecting among N actions, to the performance of the best of those actions in hindsight. Internal regret compares the loss of an online algorithm to the loss of a modified online algorithm, which consistently replaces one action by another. In this paper we give a simple generic reduction that, given an algorithm for the external ...
متن کاملOnline Learning with Transductive Regret
We study online learning with the general notion of transductive regret, that is regret with modification rules applying to expert sequences (as opposed to single experts) that are representable by weighted finite-state transducers. We show how transductive regret generalizes existing notions of regret, including: (1) external regret; (2) internal regret; (3) swap regret; and (4) conditional sw...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Dynamic Games and Applications
دوره 6 شماره
صفحات -
تاریخ انتشار 2016